Details on Dataset
This analysis examines the earning potential of various college
majors after graduation. The data is from the American Community Survey
2010-2012 Public Use Microdata Series. It covers surveyed college
graduates and the majors they graduated in, along with their earnings
and employment information. It spans 173 individual undergraduate
majors across 16 major categories.
Primary Objectives
- Find out which major categories are the most popular among
students
- Find the gender breakdown for each major category
- Find out which individual majors are the most popular (Top 15)
- Find the gender breakdown for both sets of major rankings
- Analyze the median earning potential across major categories
- Analyze the median earning potential for the Top 15 majors
- Analyze the distribution of median earnings across all majors
- Visualize the effect that different sample sizes have on the
average median earnings
- Visualize the effect different sampling methods have on the average
median earnings
- Analyze unemployment rates of Engineering and Computers &
Mathematics majors
Major Category Analysis
The following is an analysis of the overall major categories,
covering total students and the gender breakdown of each category.
Initial Hypotheses
- The Social Sciences and Humanities will be the top categories
overall, as they are by far the broadest in scope, so more majors
would come under their umbrella. This would mean they would be
over-represented in the data.
- The Engineering, Computer Science, and Physical Sciences major
categories are expected to be less popular because of their demanding
coursework requirements.
- Women will be over-represented in the Social Sciences and
Humanities, while men will be over-represented in the Engineering and
Physical Sciences categories.
Most Popular Major Categories
Findings
- Business is by far the most popular major category with
approximately 1.3 million total students
- Humanities and Liberal Arts is the second highest (~713,000) with
noticeably more students than the third most popular category, Education
(~559,000)
- Engineering being fourth in popularity was surprising to me, as I
had expected fewer students to choose it due to the intensity of
coursework
- Social Science is still in the Top 5 major categories but trails
Business and the Humanities in popularity
- Interdisciplinary majors are the least popular, with only ~12,300
students
Major Category Gender Distribution
Findings
- I was surprised that the gender split in Business was nearly even,
as I had thought more men would gravitate towards it than women
- My guess that women would be over-represented in the Humanities
was correct (~441,000 Women to ~273,000 Men)
- The split in the Social Science category was far more even than I
thought it would be (~273,000 Women to ~257,000 Men)
- I was correct in guessing that men would be over-represented in
Engineering (~408,000 Men to ~129,000 Women)
- I was surprised by the relatively even split in the Physical
Sciences (~95,000 Men to ~90,000 Women), as I had guessed that men would
be over-represented in this field
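The counts quoted in these findings can be turned into shares to make the comparison across categories easier. The following is a minimal Python sketch, not the code used to produce the report; the dictionary layout and the `share_women` helper are assumptions made for illustration, and the inputs are the rounded figures from the text, so the percentages are approximate.

```python
# Approximate counts (in thousands) quoted in the findings above.
counts = {
    "Humanities": {"women": 441, "men": 273},
    "Social Science": {"women": 273, "men": 257},
    "Engineering": {"women": 129, "men": 408},
    "Physical Sciences": {"women": 90, "men": 95},
}


def share_women(women, men):
    """Return the fraction of students in a category who are women."""
    return women / (women + men)


shares = {cat: share_women(c["women"], c["men"]) for cat, c in counts.items()}

# Print categories from the highest to the lowest share of women.
for cat, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat:<18} {share:.1%} women")
```

On these rounded figures, women make up roughly 62% of Humanities students and roughly 24% of Engineering students, while Social Science and the Physical Sciences both sit close to an even split.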
Individual Major Analysis
The following will be an analysis of the most popular majors, with
total students and gender breakdown.
Initial Hypotheses
- The majors from the Social Sciences or Humanities category will
likely take up the most spots on the Top 15
- The majors from the Engineering category will be underrepresented
on the chart with only one or two being in the Top 15
- Due to how unpopular the Computers and Mathematics major category
was, I believe that none of those majors will be seen in the top
15
- I feel that women will hold a majority of the top majors, due to
the relatively even gender split in the Business major category, along
with the majorities they hold in the Humanities, Education, Psychology
and Social Work, and Health categories
Top 15 Majors Based on Total Students
Findings
- Psychology is by far the most popular major with ~394,000
students
- Contrary to my expectations, majors from the Social Sciences and
Humanities did not hold a majority of the top 15 positions
- There was more variety in the major categories of the top 15
majors, but Business stands out with 5 positions out of 15, the most for
any single major category
  - This makes sense seeing as Business is the most popular major
  category among students
- As I expected, no majors from Computers and Mathematics were
present in the top 15
- I did not expect that no Engineering majors would make it into the
top 15, seeing as Engineering is the fourth most popular major
category
Gender Distribution for Top 15 Majors
Findings
- Women are by and large the majority in:
  - Psychology (~307,000 Women to ~87,000 Men)
  - Communications (~143,000 Women to ~71,000 Men)
  - Nursing (~188,000 Women to ~22,000 Men)
  - Elementary Education (~158,000 Women to ~13,000 Men)
  - General Education (~117,000 Women to ~27,000 Men)
- Among the top 15 majors, only in the Finance major do men have a
significant majority (~115,000 Men to ~59,000 Women)
- The numbers are relatively even in the remaining majors, with men
and women trading the majority by small amounts
- I had guessed that women would hold the majority among the most
popular majors, and I was correct in my assumption
  - Psychology and Social Work (Psychology), Health (Nursing), and
  Education (Elementary Education and General Education) were the
  specific fields I noted as most likely being composed mainly of women
Central Limit Theorem
Here we will take a look into the median earnings distribution for
the whole data set, along with the applicability of the Central Limit
Theorem.
Differing Sample Sizes
This section will show the plots of different sample sizes and how
they might affect the mean of the median earnings.
## Sample Size: 10 Sample Mean: 40272.24 SD: 3701.887
## Sample Size: 20 Sample Mean: 40245.49 SD: 2669.355
## Sample Size: 30 Sample Mean: 40199.46 SD: 2194.172
## Sample Size: 40 Sample Mean: 40183.79 SD: 1874.693
Findings
As the sample size plots above show, the distributions of sample means
all center near the overall average median earnings of $40,151. As the
sample size increases, the shape of the distribution stays roughly the
same while its mean moves towards the population mean of the data set
(40,272 at a sample size of 10, down to 40,184 at a sample size of 40)
and its standard deviation shrinks (from about 3,702 to about 1,875).
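The shrinking spread can be checked against the standard error formula from the Central Limit Theorem, SD of the sample mean = population SD / sqrt(n). The sketch below is an illustration in Python rather than the code behind the report. It assumes that the SD values printed for each sample size are standard deviations of the sample means, and it takes the population SD of 11470 from the sampling-method output in this report; `standard_error` is a helper introduced here for illustration.

```python
import math

# Population SD of median earnings, as reported for the overall population.
population_sd = 11470

# Sample size -> SD reported in the output above (assumed to be the SD of
# the distribution of sample means).
reported_sd = {10: 3701.887, 20: 2669.355, 30: 2194.172, 40: 1874.693}


def standard_error(sd, n):
    """Theoretical SD of the mean of n independent draws."""
    return sd / math.sqrt(n)


for n, observed in reported_sd.items():
    expected = standard_error(population_sd, n)
    print(f"n={n:>2}  expected SE={expected:8.1f}  reported SD={observed:8.1f}")
```

Under these assumptions the reported values sit within roughly 5 percent of the theoretical standard errors, which is consistent with the pattern the theorem predicts. The formula assumes independent draws, so an exact match is not expected for a finite population of 173 majors.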
Differing Sampling Method
This section will compare the effect different sampling methods have
on the average median earnings.
## Overall Population Sample Mean: 40151 SD: 11470
## Simple Random Sampling Sample Mean: 40658 SD: 12365
## Stratified Sampling Sample Mean: 40588 SD: 11665
## Systematic Sampling Sample Mean: 40280 SD: 11457
Findings
The charts above show that the mean stays close to $40,000 regardless
of the sampling method: simple random, stratified, and systematic
sampling give sample means of 40,658, 40,588, and 40,280, against a
population mean of 40,151.
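For readers unfamiliar with the three methods, the following Python sketch shows one common way to implement each. It is not the code used for the report: the population is synthetic (randomly generated values with made-up category labels, not the survey data), and the function names, the sample size of 50, and the proportional allocation rule are all assumptions made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in population: 173 (category, median earnings) pairs.
# These values are made up for illustration and are NOT the survey data.
categories = [f"Category {i}" for i in range(16)]
population = [
    (categories[i % 16], random.gauss(40000, 11000)) for i in range(173)
]


def simple_random(pop, n):
    """Draw n units uniformly at random, without replacement."""
    return random.sample(pop, n)


def systematic(pop, n):
    """Take every k-th unit after a random start, with k = len(pop) // n."""
    k = len(pop) // n
    start = random.randrange(k)
    return pop[start::k][:n]


def stratified(pop, n):
    """Sample each category roughly in proportion to its size.

    Rounding means the total can differ slightly from n.
    """
    strata = {}
    for cat, value in pop:
        strata.setdefault(cat, []).append((cat, value))
    sample = []
    for units in strata.values():
        take = max(1, round(n * len(units) / len(pop)))
        sample.extend(random.sample(units, min(take, len(units))))
    return sample


def mean_earnings(sample):
    """Mean of the earnings values in a sample of (category, value) pairs."""
    return statistics.mean(value for _, value in sample)


print("population :", round(mean_earnings(population)))
for name, method in [("simple", simple_random),
                     ("systematic", systematic),
                     ("stratified", stratified)]:
    print(f"{name:<11}:", round(mean_earnings(method(population, 50))))
```

Stratified sampling guarantees that every category appears in the sample, which simple random and systematic sampling do not.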
Data Wrangling
This section uses data wrangling techniques to analyze the
unemployment rates of Engineering and Computers & Mathematics majors.
Findings
From the chart above, we can see that Nuclear Engineering majors have
the highest unemployment rate (~0.178), which makes sense given the
highly specialized nature of the field and what is most likely a job
market with limited opportunities. I was surprised to see that
Mathematics and Computer Science majors had practically no unemployment,
but since this combines two very intensive quantitative STEM fields, a
lot of opportunities would be available in many different industries.
Overall, it looks like traditional Engineering majors face far lower
unemployment rates than Computers and Mathematics majors. I would have
expected the opposite, given that software development jobs are
ubiquitous, and it was surprising to see Computer Programming and Data
Processing so high up at a time when the Data Science craze was kicking
off.
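The wrangling step described here amounts to filtering the majors down to the two categories of interest and ranking them by unemployment rate. The sketch below is a plain Python illustration, not the report's code. The field names are assumptions, the rows marked as placeholders carry invented values so that the filter has something to exclude, and the 0.0 rate for Mathematics and Computer Science approximates the "practically no unemployment" described above.

```python
# Two rows use figures from the findings above; placeholder rows are
# hypothetical and exist only to exercise the filter and the sort.
majors = [
    {"major": "Nuclear Engineering", "category": "Engineering",
     "unemployment_rate": 0.178},
    {"major": "Mathematics and Computer Science",
     "category": "Computers & Mathematics",
     "unemployment_rate": 0.0},  # approximation of "practically none"
    {"major": "Placeholder Engineering Major", "category": "Engineering",
     "unemployment_rate": 0.05},  # hypothetical value
    {"major": "Placeholder Business Major", "category": "Business",
     "unemployment_rate": 0.07},  # hypothetical value
]

keep = {"Engineering", "Computers & Mathematics"}

# Filter to the two categories, then rank from highest to lowest rate.
selected = [row for row in majors if row["category"] in keep]
ranked = sorted(selected, key=lambda row: row["unemployment_rate"], reverse=True)

for row in ranked:
    print(f"{row['major']:<34} {row['unemployment_rate']:.3f}")
```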